Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding
نویسندگان
چکیده
This paper presents a new semi-supervised framework with convolutional neural networks (CNNs) for text categorization. Unlike the previous approaches that rely on word embeddings, our method learns embeddings of small text regions from unlabeled data for integration into a supervised CNN. The proposed scheme for embedding learning is based on the idea of two-view semi-supervised learning, which is intended to be useful for the task of interest even though the training is done on unlabeled data. Our models achieve better results than previous approaches on sentiment classification and topic classification tasks.
منابع مشابه
Semi-Supervised Learning with Multi-View Embedding: Theory and Application with Convolutional Neural Networks
This paper presents a theoretical analysis of multi-view embedding – feature embedding that can be learned from unlabeled data through the task of predicting one view from another. We prove its usefulness in supervised learning under certain conditions. The result explains the effectiveness of some existing methods such as word embedding. Based on this theory, we propose a new semi-supervised l...
متن کاملSupervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings
One-hot CNN (convolutional neural network) has been shown to be effective for text categorization (Johnson & Zhang, 2015a;b). We view it as a special case of a general framework which jointly trains a linear model with a non-linear feature generator consisting of ‘text region embedding + pooling’. Under this framework, we explore a more sophisticated region embedding method using Long Short-Ter...
متن کاملSemi-Supervised Classification with Graph Convolutional Networks
We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden lay...
متن کاملEffective Use of Word Order for Text Categorization with Convolutional Neural Networks
Convolutional neural network (CNN) is a neural network that can make use of the internal structure of data such as the 2D structure of image data. This paper studies CNN on text categorization to exploit the 1D structure (namely, word order) of text data for accurate prediction. Instead of using low-dimensional word vectors as input as is often done, we directly apply CNN to high-dimensional te...
متن کاملInterleaved Text/Image Deep Mining on a Very Large-Scale Radiology Database
Despite tremendous progress in computer vision, effective learning on very large-scale (> 100K patients) medical image databases has been vastly hindered. We present an interleaved text/image deep learning system to extract and mine the semantic interactions of radiology images and reports from a national research hospital’s picture archiving and communication system. Instead of using full 3D m...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Advances in neural information processing systems
دوره 28 شماره
صفحات -
تاریخ انتشار 2015